Fix naturalsize() rounding rollover at unit boundaries #329

Open
patchwright wants to merge 1 commit into python-humanize:main from patchwright:bugfix/naturalsize-rounding-rollover

@patchwright

Problem

naturalsize() picks the suffix from the unrounded byte count (exp = int(min(log(abs_bytes, base), ...))), then rounds the mantissa with the format string afterward. When rounding pushes the mantissa up to the base, the already-chosen suffix is left stale:

>>> from humanize import naturalsize
>>> naturalsize(999999)
'1000.0 kB'      # expected '1.0 MB'
>>> naturalsize(999999999)
'1000.0 MB'      # expected '1.0 GB'
>>> naturalsize(1024 ** 2 - 1, binary=True)
'1024.0 KiB'     # expected '1.0 MiB'

999999 bytes is 999.999 kB; "%.1f" rounds that to 1000.0, but the suffix was already fixed at kB, so the output reads 1000.0 kB instead of 1.0 MB. This happens at every decimal, binary, and GNU boundary.
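
The order of operations described above can be reproduced with a small stand-alone sketch. This is a simplified stand-in for the decimal path only, not the library's source: the names `DECIMAL_SUFFIXES` and `stale_suffix_size` are hypothetical, and the suffix table is shortened for illustration.

```python
from math import log

# Hypothetical, shortened suffix table used only for this illustration.
DECIMAL_SUFFIXES = ("kB", "MB", "GB", "TB")


def stale_suffix_size(value, base=1000, suffixes=DECIMAL_SUFFIXES, fmt="%.1f"):
    """Pick the suffix from the unrounded value, then round the mantissa."""
    abs_value = abs(float(value))
    if abs_value < base:
        return "%d Bytes" % value
    # The exponent (and therefore the suffix) is fixed here, before rounding.
    exp = int(min(log(abs_value, base), len(suffixes)))
    mantissa = value / base**exp
    # Rounding happens only inside the format call, too late to change exp.
    return fmt % mantissa + " " + suffixes[exp - 1]


print(stale_suffix_size(999_999))    # mantissa 999.999 is formatted as 1000.0
print(stale_suffix_size(1_500_000))  # no boundary crossed, unaffected
```

Only values whose formatted mantissa reaches the base are affected; anything further from a boundary formats as before.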

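This is distinct from the two related changes already in the repo: the ZB->YB top-boundary fix in #206, which added larger suffixes but did not touch the rounding, and the metric() fix in #328, which concerns a different function. Here the rollover also happens at every smaller unit.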

Fix

After rounding, if the mantissa has reached the base (1000 decimal / 1024 binary) and a larger suffix is available, step up one suffix. Values already at the largest suffix (QB/QiB) are unaffected (e.g. the documented naturalsize(10**34 * 3) → '30000.0 QB' still holds), as are all values that don't round across a boundary.
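
A minimal sketch of the step-up rule, under stated assumptions: `rollover_size` and the shortened `DECIMAL_SUFFIXES` table are hypothetical names for illustration, and the sketch assumes the format string yields a plain number that `float()` can parse back. The actual diff in this PR may be structured differently.

```python
from math import log

# Hypothetical, shortened suffix table used only for this illustration.
DECIMAL_SUFFIXES = ("kB", "MB", "GB", "TB")


def rollover_size(value, base=1000, suffixes=DECIMAL_SUFFIXES, fmt="%.1f"):
    """Format a size, stepping up one suffix if rounding reaches the base."""
    abs_value = abs(float(value))
    if abs_value < base:
        return "%d Bytes" % value
    exp = int(min(log(abs_value, base), len(suffixes)))
    text = fmt % (value / base**exp)
    # Re-check after rounding: if the rounded mantissa hit the base and a
    # larger suffix exists, move up one suffix and format again.
    if abs(float(text)) >= base and exp < len(suffixes):
        exp += 1
        text = fmt % (value / base**exp)
    return text + " " + suffixes[exp - 1]


print(rollover_size(999_999))       # rolls over to the next suffix
print(rollover_size(3 * 10**16))    # already at the largest suffix, unchanged
print(rollover_size(1024**2 - 1, base=1024, suffixes=("KiB", "MiB", "GiB")))
```

The `exp < len(suffixes)` guard is what leaves values at the largest suffix untouched, mirroring the QB/QiB case described above.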

Tests

Added regression assertions to test_naturalsize covering decimal, binary, and GNU boundaries. They fail on the current code and pass with this change; every existing assertion and documented example is unchanged (70 → 76 passing).

naturalsize() chooses the suffix from the unrounded byte count via
int(min(log(abs_bytes, base), ...)), then rounds the mantissa with the
format string afterward. When rounding pushes the mantissa up to the base,
the already-chosen suffix is left stale:

    >>> naturalsize(999999)
    '1000.0 kB'           # expected '1.0 MB'
    >>> naturalsize(999999999)
    '1000.0 MB'           # expected '1.0 GB'
    >>> naturalsize(1024 ** 2 - 1, binary=True)
    '1024.0 KiB'          # expected '1.0 MiB'

This is distinct from the ZB->YB top-boundary fix in python-humanize#206 (which added
larger suffixes but did not touch the rounding) and from the metric() fix
in python-humanize#328 (a different function). Here the rollover happens at every small
unit too.

Fix: after rounding, if the mantissa has reached the base and a larger
suffix is available, step up one suffix. Added regression cases to
test_naturalsize (they fail before this change, pass after); all existing
assertions and documented examples are unchanged.

Development

Successfully merging this pull request may close these issues.

Decimal filesize YB to ZB rollover doesn't happen where expected
